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O ■ ABSTRACT 

JO . Context. The Planck satellite will map the full sky at nine frequencies from 30 to 857 GHz. The CMB intensity and polarization that 

' are its prime targets are contaminated by foreground emission. 

^ ' Aims. The goal of this paper is to compare proposed methods for separating CMB from foregrounds based on their different spectral 

I and spatial characteristics, and to separate the foregrounds into 'components' with different physical origins (Galactic synchrotron, 

^ , free-free and dust emissions; extra-galactic and far-IR point sources; Sunyaev-Zeldovich effect, etc). 

T '~j ■ Methods. A component separation challenge has been organised, based on a set of realistically complex simulations of sky emission. 

' Several methods including those based on internal template subtraction, maximum entropy method, parametric method, spatial and 

\-i \ harmonic cross correlation methods, and independent component analysis have been tested. 

Results. Different methods proved to be effective in cleaning the CMB maps of foreground contamination, in reconstructing maps 
of diffuse Galactic emissions, and in detecting point sources and thermal Sunyaev-Zeldovich signals. The power spectrum of the 
residuals is, on the largest scales, four orders of magnitude lower than the input Galaxy power spectrum at the foreground minimum. 
The CMB power spectrum was accurately recovered up to the sixth acoustic peak. The point source detection limit reaches 100 miy, 
and about 2300 clusters are detected via the thermal SZ effect on two thirds of the sky. We have found that no single method performs 
best for all scientific objectives. 

Conclusions. We foresee that the final component separation pipeline for Planck will involve a combination of methods and iterations 
between processing steps targeted at different objectives such as diffuse component separation, spectral estimation, and compact 
source extraction. 

Key words. Methods: data analysis; Cosmology: cosmic microwave background 
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1. Introduction 

Planck is a European Space Agency space mission whose 
main objective is to measure the cosmic microwave background 
(CMB) temperature and polarization anisotropics with high ac- 
curacy, high angu lar resolution and with unpre cedented fre- 
quency coverage (IThe Planck Collaborationll2005h . In anticipa- 
tion of the launch, Planck is stimulating much research and 



development into data processing methods that are capable of 
addressing the ambitious science programme enabled by these 
multi-frequency observations. It is expected that Planck will 
break new ground in studies of the CMB, of the interstellar 
medium and Galactic emission mechanisms on scales down to 
a few arcminutes, as well as of the emission from many extra- 
galactic objects. 
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The processing of such multi-frequency data depends on 
both the science goals, as well as on the signal to noise regime 
and on the overall level and complexity of foreground contam- 
ination. This observation is borne out by a brief historical per- 
spective on CMB data processing. 

An example of the low foreground level and complexity 
regime is provided by the observations made by Boomerang at 
145, 245 and 345 GHz (Masi et al. 2006), which targeted a re- 
gion of sky with low emission from a single dust foreground. 
Here the two higher frequency channels acted as foreground 
monitors for the 145 GHz CMB deep survey, and were used 
to estimate that the foreground contamination at 145 GHz was 
at an RM S level of less tha n lOjuK on angular scales of 11.5' 
(Table 10. iMasi et all 120061) ). The 145 GHz CMB maps were 
then used for the purpose of power spectrum estimation in both 
temperature and polarization, after masking away a handful of 
compact sources ( Jones et"al]|2006t) . lMasi et al. (2006) estimate 
that the cleanest 40% of the sky have a level of dust brightness 
fluctuations similar to those of the Boomerang observations, and 
that the cleanest 75% of the sky have brightness fluctuations less 
than three times larger 

An example of the high foreground level and complex- 
ity regime is available with the all-sky observations of the 
WMAP mission in five frequency channels from 23 to 94 GHz 
jBennett et alj l2003at [Hinshaw et al.l 12008 ). In this frequency 
range, the emission from at least three Galactic components 
(synchrotron, free-free and dust), as well as contamination by 
unresolved point sources must be contended with. WMAP also 
gives a clear example of science goal dependent data process- 
ing: CMB maps for use in non-Gaussianity tests are obtained 
from a noise-weighted sum of frequency maps at differing angu- 
lar resolution, for whi ch the regions most c ontaminated by fore- 
grounds are masked ( Komatsu et al.l 120031) ; The analysis lead- 
ing to the WMAP cosmological parameter estimation involves 
foreground cleaning by template subtraction, masking of the 
most contaminated 15% of sky, and subtracting a model of the 
contribution o f unresolved point sources from the CMB cross 
power spectra dHinshaw et alj|2003ll2007h . For an improved un- 
derstanding of galactic emission, the WMAP team have used a 
number of methods including template fits, internal linear com- 
bination (ILC), the maximum entropy meth od, and the direct 
pixel-by-pixel fitting of an emission model (iGold etalJ 120081 
iDunklev et aLll2008l) . 

1.1. Component Separation 

Component separation is a catch-all term encompassing any data 
processing that exploits correlations in observations made at sep- 
arate frequencies, as well as external constraints and physical 
modeling, as a means of distinguishing between different physi- 
cal sources of emission. 

Planck has a number of different scientific objectives: the 
primary goal is a cosmological analysis of the CMB, but im- 
portant secondary goals include obtaining a better understand- 
ing of the interstellar medium and Galactic emission, measure- 
ment of extragalactic sources of emission and the generation of 
a Sunyaev-Zeldovich (SZ) cluster catalogue. These planned ob- 
jectives will lead to a set of data products which the Planck con- 
sortium is committed to delivering to the wider community some 
time after the completion of the survey. These data products in- 
clude maps of the main diffuse emissions and catalogues of ex- 
tragalactic sources, such as galaxies and clusters of galaxies. 

In this context, it is worth remembering that Planck is de- 
signed to recover the CMB signal at the level of a few mi- 



crokelvin per resolution element of 5' (and less than one mi- 
crokelvin per square degree). Numbers to keep in mind are the 
RMS of CMB smoothed with a beam of 45' FWHM, which is 
around 70yuK, while the RMS of white noise at the same scale is 
around 0.7yuK. This level of sensitivity sets the ultimate goals for 
data processing — and component separation in particular — if 
the full scientific potential of Planck is to be realised. However, 
less stringent requirements may be acceptable for statistical anal- 
yses such as power spectrum estimation, in particular on large 
scales where cosmic variance dominates the error of total inten- 
sity observations. 

1.2. WG2 

Planck is designed to surpass previous CMB experiments in 
almost every respect. Therefore, a complete and timely ex- 
ploitation of the data will require methods that improve upon 
foreground removal via template subtraction and masking. The 
development and assessment of such methods is coordinated 
within the Planck 'Component Separation Working Group' 
(WG2). Another working group in the Planck collaboration, 
the Ce temperature and polarization working group (WG3 or 
"CTP"), investigates other critical data analysis steps, in partic- 
ular, map-making (Poutanen et al. 2006; Ashd own et al.l |2007|) 
and power spectrum estimation. 

The present paper reports the results of the WG2 activity 
in the framework of a component separation challenge using a 
common set of simulated Planck dataQln turn, this exercise pro- 
vides valuable feedback and validation during the development 
of the Planck Sky Model. 

This is the first time, within the Planck collaboration, that 
an extensive comparison of component separation methods has 
been attempted on simulated data based on models of sky emis- 
sions of representative complexity. As will be seen and empha- 
sised throughout this paper, this aspect is critical for a mean- 
ingful evaluation of the performance of any separation method. 
In this respect, the present work significantly improves on the 
se mi-analytical estima tes of foreground contamination obtained 
by iBouchet & Gisperl ( fl999|) for the Planck pha se A study, as 
well as on previous work bv lTegmark et alj ( l2000h . 

The paper is organised as follows: In Section|2]we describe 
the sky emission model and simulations that were used, and de- 
scribe the methodology of the Challenge. In Section[3]we give an 
overview of the methods that have been implemented and took 
part in the analysis. In Section|4]we describe the results obtained 
for CMB component separation and power spectrum estimation. 
The results for point sources, SZ and Galactic components are 
described in Section|5]and in Section|6]we present our summary 
and conclusions. In Appendix|A]we provide a more detailed de- 
scription of the methods, their implementation details and their 
strengths and weaknesses. 



2. The challenge 

The objective of the component separation challenge discussed 
herein is to assess the readiness of the Planck collaboration to 
tackle component separation, based on the analysis of realisti- 
cally complex simulations. It offers an opportunity for compar- 
ing the results from different methods and groups, as well as 



' A similar data challenge has been undertaken in the past in the 
context of simulated WMAP an d sub-orbital CMB data (the WOMBAT 
challenge; iGawiser et"aLlll998h . 
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to develop the expertise, codes, organisation and infrastructure 
necessary for this task. 

This component separation challenge is designed so as to 
test on realistic simulated data sets, component separation meth- 
ods and algorithms in a situation as close as possible to what is 
expected when actual Planck data will be analysed. Hence, we 
assume the availability of a number of ancillary data sets. In par- 
ticular, we assumed that six-year WMAP observations will be 
available. Although WMAP is significantly less sensitive than 
Planck, it provides very useful complementary information for 
the separation of low-frequency Galactic components. This sec- 
tion describes our simulations, the challenge setup, and the eval- 
uation methodology. 

2.1. Sky emission 

Our sky simulations are based on an early development version 
of the Planck Sky Model (PSM, in preparation), a flexible soft- 
ware package developed by Planck WG2 for making predic- 
tions, simulations and constrained realisations of the microwave 
sky. 

The CMB sky is based on the observed WMAP multipoles 
up to ^ = 70, and on a Gaussian realisation assuming the WMAP 
best-fit C( at higher ni ultipoles. It is the same CMB map used by 
lAshdownetairfcOO?!) . 

The Galactic interstellar emission is described by a three 
component model of the interstellar medium comprising of free- 
free, synchrotron and dust emissions. The predictions are based 
on a number of sky templates which have different angular res- 
olution. In order to simulate the sky at Planck resolution we 
have added small scale fluctuations to s ome of the templates. The 
procedure used is the one presented in iMiville-Deschenes et al.l 
(l2007h which allows to increase the fluctuation level as a func- 
tion of the local brightness and therefore reproduce the non- 
Gaussian properties of the interstellar emission. 

Fre e-free emission is based on the model of [Dickinson et al.l 
(l2003h assuming an electronic temperature of 7000 K. The spa- 
tial structure of the emission is estimated using a Ha template 
corrected for dust extinction. The Ha map is a combination of 
the Southern H-Alpha Sky Survey Adas (SHASSA) and the 
Wisconsin H- Alpha Mapper (WHAM). The combined map was 
smoothed to obtain a uniform angular resolution of 1°. For the 
extinction map we use the E(B-V) all-sky map of Schleg el et alj 
( 1199 8) which is a combination of a smoothed IRAS 100 //m map 
(with resolution of 6.1') and a map at a few degrees resolution 
made from DIRBE data to estimate dust temperature and trans- 
form the infrared emission in extinction. As mentioned earlier, 
small scales were added in both templates to match the Planck 
resolution. 

Synchrotron emission is based on an extrapolation of the 
408 MHz map of iHaslam et alj ( Il982h from which an estimate 
of the free-free emission was removed. The spectral emission 
law of the synchrotron is assumed to follow a perfect power 
law, T^J"^ oc We use a pixel-dependent spectral index /3 de- 
rived from the ratio of the 408 MHz map and the estimate of the 
synchrotron emission at 23 GHz in the WMAP data obtained 
by Bennett etal. (2003b) using a Maximum Entropy Method 
technique. A limitation of this approach is that this synchrotron 
model also contains any 'anomalous' dust correlated emission 
seen by WMAP at 23 GHz. 

The thermal emission from interstellar dust is estimated us- 
ing model 7 of Finkbeiner et al. ( 1999). This model, fitted to the 
FIRAS data (7° resolution), makes the hypothesis that each line 
of sight can be modelled by the sum of the emission from two 
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Fig. 1. Synchrotron and dust (effective powerlaw) spectral in- 
dices evaluated between 30 and 44 GHz, and 143 and 217 GHz 
respectively (in /vKrj). A spatially varying spectral index corre- 
sponds to a foreground morphology that varies with frequency. 

dust populations, one cold and one hot. Each grain population 
is in thermal equilibrium with the radiation field and thus has a 
grey-body spectrum, so that the total dust emission is modelled 
as 

2 

!=1 

where By(Ti) is the Planck function at temperature T,. In model 
7 the emissivity indices are /3i - 1.5, /32 - 2.6, and /i = 0.0309 
and /2 - 0.9691. Once these values are fixed, the dust temper- 
ature of the two components is determined using only the ratio 
of the observations at two wavelengths, 100 //m and 240 jum. 
For this purpose, we use t he 100/240 yum map ratio published 
by Finkb einer et alj (Il999l) . Knowing the temperature and /3 of 
each dust component at a given position on the sky, we use the 
100 fim brightness at that position to scale the emission at any 
frequency using Eq.([T]i. We emphasise that the emission laws of 
the latter two components, synchrotron and dust, vary across the 
sky as shown in Figure[T] The spectral index of free-free is taken 
to be uniform on the sky since it only depends on the electronic 
temperature, taken as a constant here. 

Point sources are modelled with two main categories: radio 
and infra-red. Simulated radio sources are based on the NVSS 
or SUMSS and GB6 or PMN catalogues. Measured fluxes at 1 
and/or4.85 GHz are extrapolated to Planck frequencies assum- 
ing a distribution in flat and steep populations. For each of these 
two populations, the spectral index is randomly drawn from 
within a set of values compatible with the typically observed 
mean and dispersion. Infrared sources are based on the IRAS 
catalogue, and modelled as dusty galaxies dSerieant & HarrisonI 
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l2005h . IRAS coverage gaps were filled by adding simulated 
sources with a flux distribution consistent with the mean counts. 
Fainter sources were assumed to be mostly sub-millimeter bright 
galaxies, such as those detected by S CUBA surveys. These 
were modelled following iGranato et al.l (12004.) and assumed to 
be strongly clustered, with a comoving clustering radius ro ^ 
8 h"' Mpc. Since such sources have a very high areal density, 
they are not simulated individually but make up the sub-mm 
background. 

We also include in the model a map of thermal SZ spectral 
distortions from galaxy clusters, based on a simulated cluster 
catalogue drawn from a mass-function compatible with present- 
day observati ons and with ACDM parameters - 0.3, h = 0.7 
and erg = 0.9 dColafrancesco et al.lll997l:lde Zotti et al]|2005h . 

Component maps are produced at all Planck and WMAP 
central frequencies. They are then co-added and smoothed with 
Gaussian beams as indicated in Table [1] A total of fourteen 
monochromatic maps have been simulated. 

Finally, inhomogeneous noise is obtained by simulating the 
hit counts corresponding to one year of continuo us observations 
by Pl anck, using the Level-S simulations tool dReinecke et alj 
|2006|) . An example of a hit count map is shown in the upper 
panel of Figure |2l WMAP six year hit counts, obtained from 
scaling up the observed WMAP three year hit count patterns, are 
used to generate inhomogeneous noise in the simulated WMAP 
observations. The RMS noise level per hit for both experiments 
is given in Table[T] 

2.2. Challenge setup 

The simulated data sets were complemented by a set of ancil- 
lary data including bitmaps and noise levels, IRAS, 408 MHz, 
and Ha templates, as well as catalogues of known clusters from 
ROS AT and of known point sources from NVSS, SUMSS, GB6, 
PMN and IRAS. 

The Challenge proceeded first with a blind phase lasting 
around four months between August and November 2006, when 
neither the exact prescription used to simulate sky emission from 
these ancillary data sets, nor maps of each of the input compo- 
nents, were communicated to challenge participants. 

After this phase and an initial review of the results at the 
WG2 meeting in Catania in January 2007, the Challenge moved 
to an open phase lasting from January to June 2007. In this phase 
the input data — CMB maps and power spectrum. Galactic emis- 
sion maps, SZ Compton y parameter map, point source cata- 
logues and maps, noise realisations — were made available to the 
participating groups. 

All of the results presented here have been obtained after 
several iterations and improvements of the methods, both dur- 
ing the comparison of the results obtained independently by the 
various teams, and after the input data was disclosed. Hence, 
the challenge has permitted significant improvement of most of 
the methods and algorithms developed within the Planck col- 
laboration. The analysis of the Challenge results was led by the 
simulations team, with involvement and discussion from all par- 
ticipating groups. 

Deliverables 

A set of standard deliverables were defined. These included: a 
CMB map with 1.7' pixels (Healpix A^side - 2048) together with 
a corresponding map of estimated errors; the eff'ective beam F(, 
which describes the total smoothing of the recovered CMB map 
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Fig. 2. Upper panel: Hit counts for the 143 GHz channel. The in- 
homogeneities at the ecliptic poles are characteristic of Planck's 
cycloidal scanning strategy. Lower panel: The masking scheme 
separating the sky in three regions of different foreground con- 
tamination. The grey region at high Galactic latitudes is Zone 1, 
covering /sky - 74%. The darker region at lower Galactic lati- 
tudes is Zone 2 and covers /sky = 22%. The remaining region 
(green) along the Galactic ridge is Zone 3. The point source 
mask (red) covers 4% of Zone 1. The SZ mask (yellow) cuts 
detected SZ clusters at Galactic latitudes above 20 degrees, cov- 
ering 1 .4% of sky. 



due to a combination of instrumental beams and the filtering 
induced by the component separation process; a set of binned 
CMB power spectrum estimates (band averages of {{{ + l)C() 
and error bars; maps of all the diffuse components identified in 
the data; catalogues of the infrared and radio sources, and SZ 
clusters; a map of the SZ Compton y parameter. 

Masks 

Different separation methods are likely to perform differently in 
either foreground-dominated or noise-dominated observations. 
Also, they may be more or less sensitive to different types of 
foregrounds. Since the level of foreground contamination varies 
strongly across the sky, we used a set of standard masks through- 
out this work, and they are shown in the lower panel of Figure|2] 

The sky is split into three distinct Galactic 'Zones': Zone 1 
is at high Galactic latitudes and covers 74% of sky, similar to the 
WMAP KpO mask with smoother edges and small extensions. 
Zone 2 is at lower Galactic latitudes and covers 22% of sky. The 
remaining 4% of sky is covered by Zone 3, which is similar to 
the WMAP Kp 12 mask. 

The point source mask is the product of nine masks, each 
constructed by excluding a two FWHM region around every 
source with a flux greater than 200 mJy at the corresponding 
Planck frequency channel. This point source mask covers 4% of 



S. M. Leach et al: Component separation methods for the Planck mission 



5 



Table 1. Characteristics of Planck one year simulations (upper) and WMAP six year simulations (lower). Planck and WMAP 
hit counts correspond to 1.7' (Healpix «side=2048) and 6.8' («side=512) pixels respectively. Nc is the white noise level calculated 
from the inhomogeneous distribution of hits. 



Channel 


30 GHz 


44 GHz 


70 GHz 


100 GHz 


143 GHz 


217 GHz 


353 GHz 


545 GHz 


857 GHz 


FWHM [ai-cmin] 


33 


24 


14 


10 


7.1 


5 


5 


5 


5 


0"hil [juKrj] 


1030 


1430. 


2380 


1250 


754 


610 


425 


155 


72 


0"hit [/^Kcmb] 


1050 


1510 


2700 


1600 


1250 


1820 


5470 


24700 


1130000 


Mean; Median hits per pixel 


82; 64 


170; 134 


579; 455 


1010; 790 


2260; 1790 


2010; 1580 


2010; 1580 


503; 396 


503; 396 


N)'- [//Kcmb] 


0.066 


0.065 


0.063 


0.028 


0.015 


0.023 


0.068 


0.62 


28.4 



Channel 


23 GHz (K) 


33 GHz (Ka) 


41 GHz (Q) 


61 GHz (V) 


94 GHz (W) 


FWHM [arcmin] 


52.8 


39.6 


30.6 


21 


13.2 


CThit Ij"Krj] 


1420 


1420 


2100 


2840 


5210 


C hit I/'KcMb] 


1440 


1460 


2190 


3120 


6500 


Mean; Median hits per pixel 


878; 792 


878; 790 


2198; 1889 


2956; 2577 


8873; 7714 


Nf I/^Kcmb] 


0.10 


0.10 


0.10 


0.12 


0.14 



sk y in Zone 1 . For comp arison, the WMAP point source masks 
of iBennett etal] (l2003bj ) excludes a radius of 0.7° around almost 
700 sources with fluxes greater than 500 mJy, covering a total of 
2% of sky. 

The SZ mask is constructed by blanking out small circular 
regions centered on 1625 SZ clusters detected with the needlet- 
ILC + matched filter method (see Section l572] ). For each of them, 
the diameter of the cut is equal to the virial radius of the corre- 
sponding cluster. 

2.3. Comments about the sky emission simulations 

A note of caution about these simulations of sky emission is in 
order. Although the PSM, as described above, has a consider- 
able amount of sophistication, it still makes some simplifying 
assumptions - and cannot be expected to describe the full com- 
plexity of the real sky. This is a critical issue, as component sep- 
aration methods are very sensitive to these details. We mention 
four of them. 

First, Galactic emission is modelled with only three com- 
ponents, with no anomalous emission at low frequencies. This 
affects the spectral behaviour of components in the lower fre- 
quency bands below 60 GHz where the anomalous emission is 
thought to be dominant ("Pa vies et aT]|2006l; iBonaldi et al.ll2007t 
[Mivifle-Deschenes et al. 200^^ 

Second, even though variable spectral emission laws are used 
for synchrotron and dust emission, this is still an idealisation: for 
the synchrotron, the emission law in each pixel is described by a 
single spectral index without any steepening. For dust, the emis- 
sion is modelled as a superposition of two populations, with dis- 
tinct but fixed temperature and emissivity. These approximations 
impact component separation, since almost perfect estimation of 
the relevant parameters of a given foreground emission is pos- 
sible at frequencies where this foreground dominates, thereby 
allowing perfect subtraction in the cosmological channels. 

Third, it is worth mentioning that only low resolution (~ 1°) 
templates are available for synchrotron and free-free emissions. 
Hence, addition of small-scale power is critical: if such scales 
were absent from the simulations, but actually significant in the 
real sky, one might get a false impression that no component 
separation is needed on small scales. Also, the detection of point 
sources as well as galaxy clusters would be significantly eas- 
ier, hence not representative of the actual problem. Here, miss- 
ing small scale features are simulated using a non-stationary 
coloured Gaussian random field. Although quite sophisticated. 



this process can not generate for instance, filamentary or patchy 
structures known to exist in the real sky. 

Fourth, our simulations are somewhat idealised in the sense 
that we use perfect Gaussian beams, assume no systematic ef- 
fects, and assume that the noise is uncorrected from pixel to 
pixel and from channel to channel. Also, the effect of the finite 
bandpass of the frequency channels is not taken into account, and 
we assume that the calibration and zero levels of each channel is 
perfectly known. 

In spite of these simplifications, component separation re- 
mains a difficult task with our simulated data because of pixel- 
dependent spectral emission laws for dust and synchrotron, and 
of the presence of more than a million point sources with dif- 
ferent emission laws, of hundreds of thousands of unresolved or 
extended SZ clusters, and of significant emission from a com- 
plex IR background. It is fair to say that this simulated sky is far 
more complex than anything ever used in similar investigations. 

In closing this Section, we show in Figure |3] the angular 
power spectra of the basic components for the 70 and 100 GHz 
channels, close to the foreground emission minimum. The spec- 
tra of CMB, noise and thermal SZ are compared to the spectra 
of the total Galactic emission evaluated at high and low Galactic 
latitudes, on Zone 1 and 2 respectively. The point source spec- 
tra are evaluated in Zone 1, both with and without the bright- 
est sources above 200 mJy masked. Figure |3] shows the obvi- 
ous impact on CMB studies of masking the most foreground- 
contaminated regions. It also indicates that there is a significant 
region of sky. Zone 2, for which Galactic emission and CMB 
power are comparable. In the following sections, results are eval- 
uated independently in both Zones 1 and 2. 



3. Outline of the methods 

In this section we present a brief overview of the methods that 
have been used in this challenge. The section is divided in three 
parts, one for diffuse component separation methods, one for 
point source extraction, and one for SZ cluster extraction. 

3.1. Diffuse component separation 

The spirit of each method tested on the challenge data is out- 
lined here. A more detailed description, including some details 
of their implementations and a discussion of their strengths and 
weaknesses is presented in AppendixlAl 
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Table 2. Some characteristics of the diffuse component separation methods used in the challenge. 





Channels used 


Components modelled 


Resources and runtime 


>^ wiviivi/\iN i-^ r_. tx 


WiVlrtr, rLANCK jU— JJJ ^jrii, 


CMB, dust, sync, FF, mono-,dipoles 


iuuu L^r u nr, z uay 


CCA 


Planck, Haslam 408 MHz 


CMB, dust, sync, FF 
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Fig. 3. Spectra of the simulated microwave sky components near 
the foreground minimum. CMB, noise with the effect of beam 
deconvolution, and the thermal SZ effect are evaluated on the 
full sky; point source power is evaluated on Zone 1+2 both with 
and without sources above 200mJy masked; the galaxy power 
spectra are evaluated on Zone 1 and on Zone 2. The well-known 
importance of masking is evident, as is the fact that there is a 
significant proportion of sky (Zone 2, /sky - 22%) for which 
Galactic emission is comparable to CMB power. 

First we define some relevant terminology. The data model 
for a given channel v is 



by * Xy + riy 



(2) 



where dy, Xy, Uy are respectively the observation map, the sky 
emission map and the noise map at frequency v while by is the 
instrumental beam of channel v, assumed to be Gaussian sym- 
metric, and * denotes convolution on the sphere. The sky emis- 



sion itself, Xy, is a superposition of components. Most methods 
assume (implicitly or explicitly), that it can be written as a linear 
mixture 



Y,Ay 



(3) 



where the sum runs over the components. In matrix-vector for- 
mat, this reads x - ks where A is referred to as the 'mixing 
matrix'. Vector s is the vector of components. Vectors d and n 
are defined similarly. When this model holds, Eq. (|2]i becomes 



dy - by* 



z 



+ riy 



(4) 



In simple models, matrix A is constant over the sky; in more 
complex models, it varies over patches or even from pixel to 
pixel. 

We now briefly describe each of the methods that performed 
component separation of the CMB (and possibly other diffuse 
components), and also mention how the CMB angular power 
spectrum is estimated. 



- Gibbs sampling (Commander: lEriksen et alJl2008h . The ap- 
proach of Commander is to fit directly an explicit parametric 
model of CMB, foregrounds and noise to the antenna tem- 
perature of low-resolution map pixels. For the Challenge, 
Commander was used to analyse the data smoothed to 3° 
resolution at each channel with a pixel size of 54' (Healpix 
A^side=64). For a given foreground model. Commander pro- 
vides an exact foreground-marginalised CMB Cc distribu- 
tions using the Gibbs (conditional) sampling approach; 

- Correlated component analysis (CCA; Bed ini et aLi2005h . 
The CCA approach starts with an estimation of the mix- 
ing matrix on patches of sky by exploiting spatial correla- 
tions in the data, supplemented by constraints from external 
templates and foreground scaling modeling. The estimated 
parameters are then used to reconstruct the components by 
Wiener filtering in the harmonic domain. The Cc are esti- 
mated from the recovered CMB map. 

- Independent component analysis (FastICA; iMaino et aTl 
2002). The FastICA method is a popular approach to blind 
component separation. No assumptions are made about the 
frequency scaling or mixing matrix. Instead, assuming statis- 
tical independence between CMB and foregrounds, the mix- 
ing matrix is estimated by maximizing the non-gaussianity 
of the 1 -point distribution function of linear combinations of 
input data. The inferred mixing matrix is used to invert the 
linear system of Eq. (|4|l. The Cf 's are estimated from the re- 
covered CMB map. 

- Harmonic-space maximum entropy me thod (FastMEM; 
iHobson et all 119981; IStolvarov et all 120021) . The FastMEM 
method estimates component maps given frequency scaling 
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models and external foreground power spectra (and cross- 
power spectra) with adjustable prior weight. It is a non- 
blind, non-linear approach to inverting Eq. (|4|i, which as- 
sumes a maximum-entropy prior probability distribution for 
the underlying components. The C^'s are estimated from the 
recovered CMB component. 

- Generalised morphological component analysis (GMCA; 
^obin et al. 2007). Generalised Morphological Component 
Analysis is a semi-blind source separation method which dis- 
entangles the components by assuming that each of them is 
sparse in a fixed appropriate waveform dictionary such as 
wavelets. For the Challenge two variants of GMCA were 
applied: GMCA-blind was optimised for separation of the 
CMB component, and GMCA-model was optimised for sep- 
aration of galactic components. The 's are estimated from 
the recovered CMB map from the GMCA-blind method. 

- Spectral estimation via exp ectation maximisation 
(SEVEM; iMartmez-Gonzalez et al, 2003.) . SEVEM per- 
forms component separation in three steps. In a first step, an 
internal template subtraction is performed in order to obtain 
foreground-reduced CMB maps in three centre channels 
(100-217 GHz). Then the CMB power spectrum is estimated 
from these maps, via the EM algorithm, assuming a signal 
plus (correlated) noise model. A final CMB map is obtained 
using a harmonic Wiener filter on the foreground-reduced 
maps. 

- Spectral matching in depende nt component ana lysis 

(SMICAi lDefabrouille et al...2003f:ICardoso et al.ll2008l) . The 
SMICA method estimates model parameters using observa- 
tion correlations in the harmonic domain (auto- and cross- 
spectra). The estimated parameters are typically some mix- 
ing coefficients and the power spectra of independent com- 
ponents. For the challenge, the correlations between Galactic 
components are taken into account. The estimated parame- 
ters are then used to Wiener-filter the observations to obtain 
component maps. At small scales the Cf 's are one of the es- 
timated parameters. At large scales ( < 100 the C^'s are es- 
timated from a CMB map obtained using the ILC method. 

- Wavelet based high resolution fitting of internal tem- 
plates (WI-FIT: iHansen et alj 1200 6). The WI-FIT method 
computes CMB-free foreground plus noise templates from 
differences of the observations in different channels, and uses 
those to fit and subtract foregrounds from the CMB domi- 
nated channels in wavelet space. The Cf 's are estimated from 
the recovered CMB map. 

Some characteristics of these methods are summarised in 
Table |2l which shows the data used, the components modelled 
and a rough indication of the computational resources required. 

Note that many different approaches to diffuse component 
separation are represented here: blind, non-blind, semi-blind; 
methods based on linear combinations for foreground extraction; 
likelihood based methods which estimate parameters of a model 
of the foregrounds and the CMB; a maximum entropy method; 
methods based on cross correlations; a method based on sparsity. 
They also rely on very diverse assumptions and models. 

3.2. Point source extraction 

In the present challenge, point sources are detected in all Planck 
channels independently. Two methods are used, the first based 
on a new implementation of matched filtering, and the second 
using the second member of the Mexican Hat Wavelet Family of 



filters (iGonzalez-Nuevo et al.ll2006h . Point sources are detected 
by thresholding on the filtered maps. 

This corresponds to a first step for effective point source de- 
tection. It does not exploit any prior information on the position 
of candidate sources; Such information can be obta ined from ex- 
ternal catalogues as in lLopez-Caniego et al.l (l2007l) . or from de- 
tections in other Planck channels. Neither does this approach 
exploit the coherence of the contaminants throughout Planck 
frequencies, nor try to detect point sources jointly in more than 
one channel. Hence, there is margin for improvement. 

- Matched Filter (MF): The high spatial variability of noise 
and foreground emission suggests using local filters (for in- 
stance on small patches). The sky is divided into 496 over- 
lapping circular regions 12 degrees in diameter Matched fil- 
tering is applied on each patch independently. A local esti- 
mate of the power spectrum of the background is obtained 
from the data themselves by averaging the power in circu- 
lar frequency bins. A first pass is performed to detect and 
remove the brightest sources (above 20cr), in order to re- 
duce the bias in background power estimation and to reduce 
possible artifacts in the filtered maps. Having removed these 
bright sources, the Scr level catalogue is obtained by a second 
application of the whole procedure. 

- Mexican Hat Wavelet (MHW2): In a similar way, the sky 
is divided into 371 square patches. The size of each patch 
is 14.65 X 14.65 square degrees, with a 3 degree overlap 
among patches. Each patch is then individually filtered with 
the MHW2. For each patch, the optimal scale of the wavelet 
is obtained by means of a fast maximization of the wavelet 
gain factor. This step requires only a straightforward estima- 
tion of the variance of the patch, excluding the border and 
masking any sources above 30cr. A 5cr level catalogue is ob- 
tained by simple thresholding in a single step. 

3.3. SZ cluster extraction 

In the present data challenge, we address both the question of 
building an SZ catalogue, and of making a map of thermal SZ 
emission. 

SZ map: Three methods successfully produced SZ maps: ILC 
in harmonic space, ILC on a needlet frame, and SMICA. For 
ILC methods, the data are modelled as d - as + n where d is 
the vector of observations (nine maps here, using Planck data 
only), a is the SZ spectral signature at all frequencies (a vector 
with nine entries), s is the component amplitude and n is the 
noise. The ILC provides an estimator 7[lc of s using 

^ilc = d (5) 

a' R 1 a 

where R is the empirical correlation of the observations, i.e. a 9x 
9 matrix, with entries In practice, the filter is implemented 
in bands of £ (ILC in harmonic space) or on subsets of needlet 
coefficients (ILC in needlet space). The needlet-ILC adapts to 
the local background to recover the SZ sky. 
SZ catalogue: Three main methods were used to obtain the clus- 
ter catalogue: 

- The first one us es a single frequency matched filter 
(iMeUn et alj|2006l) to extract clusters from the needlet-ILC 
map. 

- The second one uses SExtractor (iBertin & Arnoutsll 19961) to 
extract clusters from the needlet-ILC map. Then, a single fre- 
quency matched filter is used to estimate cluster fluxes. 
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- The third is a Matched MultiFilter dHerranz et all l2002h . 
which implements cluster detection using the full set of input 
observations rather than from an intermediate SZ map. This 
third method is implemented independently in Saclay and in 
Santander. 

The performance of these four methods is detailed in Table |4] 
The comparison is done at the same contamination level 
(~ 10%), which corresponds to S/N > 4.7 for the needlet-ILC 
+ MF catalogue, S/N > 3.8 for the needlet-ILC + SExtractor 
catalogue, S /N > 4.3 for the Matched Multifilter (MMF) Saclay 
catalogue and S /N > 4.6 for the MMF Santander catalogue. 

This comparison is being extended to other cluster extrac- 
tion methods in collaboration with the Planck 'Clusters and 
Secondary Anisotropics' working group (WG5). Some improve- 
ments are obtained using SExtractor as the extraction tool after 
the component separation step. There is still some margin for 
other improvements by increasing the studied area to include 
lower Galactic latitudes and by combining the SZ extraction 
methods with CMB and Galactic extraction methods more in- 
timately. 

4. Results for CMB 

We now turn to the presentation and discussion of the results of 
the challenge, starting with the CMB component. We evaluate 
performance based on residual errors at the map and spectral 
level, and on residual errors at the power spectrum estimation 
level. 

The first point to be made is that all methods have pro- 
duced CMB maps in Zones 1 and 2. Foreground contamination 
is barely visible. A small patch representative of CMB recon- 
struction at intermediate Galactic latitude, is shown in Figure |9] 
In the following, we focus on the analysis of the reconstruction 
error (or residual). 

Since each method produces a CMB map at a different reso- 
lution, the recovered CMB maps are compared both against the 
input CMB sky smoothed only by the l.T pixels, and against a 
45' smoothed version, in order to the emphasise errors at large 
scales. 

4. 1 . Map-level residual errors 

Maps of the CMB reconstruction error, with all maps smoothed 
to a common 45' resolution, are shown in Figure |5] for all of 
the methods (excluding Commander, which produced maps at 3° 
resolution). The remaining Galactic contamination is now visi- 
ble at various levels for most methods, and in particular close to 
regions with the strongest levels of free-free emission. There is 
also evidence of contamination by SZ cluster decrements, which 
are visible as distinct negative sources away from the Galactic 
plane. As can be seen, significant differences between methods 
exist. 

- At high Galactic latitudes, at this 45' scale, the lowest 
contamination is achieved by SMICA, GMCA-BLIND and 
FastlCA. 

- In Zone 2, CCA, GMCA-MODEL, and FastMEM seem to 
filter out Galactic emission best while FastlCA and WI-FIT 
are strongly contaminated. 

A quantitative measure of the raw residual of the CMB map 
(reconstructed CMB minus unsmoothed input CMB) is provided 
by its RMS, calculated for 18 zonal bands, each 10 degrees wide 
in Galactic latitude, excluding pixels in Zone 3 and from the 
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Fig. 4. (Upper) RMS of the residual eiTor of the CMB map, cal- 
culated for each of 18 bands of 10 degrees width in Galactic lat- 
itude. For comparison, itcmb = IQA-.S/jK and cTnoisg - 29.3yL(K, 
for the 143 GHz channel alone (1.7' pixels). (Lower) RMS of 
this residual map calculated at 45' resolution. For comparison, 
o-cmb(45') = 69.8;uK and cr„oise(45') = 0.7//K for the 143 GHz 
channel. The corresponding residual maps are shown in Figure|5] 



point source mask. The results are shown in the upper panel 
of Figure m This quantity, denoted cr^cMB, gives a measure of 
the sum of the errors due to residual foreground contamination, 
noise, as well as from residual CMB (due to non unit response on 
small scales, for instance). For orientation, we can see that the 
ensemble of methods span the range 13juK< (Tacmb < 35;uK, 
which can be compared with ctcmb = 104.5/iK and cTnoise = 
29.3;uK, for the 143 GHz channel. 

Similarly, the lower panel of Figure |4] shows the RMS of 
the smoothed residual errors shown in Figure |5] Depending on 
the method, the typical level of foreground contamination (plus 
noise) has an RMS from 2 to 5/vK on this smoothing scale. For 
comparison, crQyiB(45') - 69.8;uK and cr„oise(45') - Q.ljjK for 
the 143 GHz channel. 

4.2. Spectral residual errors 

Next we calculate the spectra of the CMB raw residual maps, 
both on Zone 1 and Zone 2 (high and low Galactic latitudes). 
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Fig. 5. CMB reconstruction error smootlied at 45' resolution. These maps are described in Section l4Tn and their RMS in Galactic 
latitude strips of 10° are shown in FigureH] Over a large fraction of sky, the typical RMS of the residual error is between 2 and 5/iK, 
which can be compared with crcMB(45') - 69.8;uK and crnoise(45') - 0.7 fiK for the 143 GHz channel. Some contamination from the 
galactic components and bright clusters remains. 



with the brightest point sources masked. The results are shown 
in Figure |6] 

By comparing the spectra of the residuals with the original 
level of diffuse foreground contamination shown in Figure |3] we 
can see that a considerable degree of diffuse foreground clean- 
ing has been attained. There seems however to be a 'floor' ap- 
proached by the ensemble of methods, with a spread of about 



a factor of ten indicating diff'erences in performance. This floor 
appears to be mostly free of residual CMB signal which would 
be visible as acoustic oscillations. 

Its overall shape is not white: at high Galactic latitudes the 
residual spectra bottom out at very roughly A = 0.015 x^"*'-^yuK^, 
while a low Galactic latitudes the spectra bottom out at A = 
0.02 X f " V^^. This limit to the level of residuals is consider- 
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Fig. 6. Spectra of the CMB residual maps, evaluated on Zone 1 
(high Galactic latitudes) and Zone 2 (low Galactic latitudes), 
both regions with point sources masked. Comparison with 
Figure |3] shows the extent to which the Galactic contamination 
has been removed from the CMB on large angular scales. 



ably higher than the 'foreground-free' noise limit displayed as a 
dashed line. 

It is, however, also significantly lower than the CMB cos- 
mic variance, even with 10% binning in This comforts us 
in the impression that component separation is effective enough 
for CMB power spectrum estimation (discussed next in this pa- 
per), although it may remain a limiting issue for other type of 
CMB science. In particular, it suggests that the component sep- 
aration residuals, with these channels and the present methods, 
will dominat e the error in Planck CMB maps. 

Recently iHuffenberger et all ( l2008l) performed a reanalysis 
of the impact of unresolved point source power in the WMAP 
three-year data. They found that cosmological parameter con- 
straints are sensitive to the treatment of the unresolved point 
source power spectrum beyond t - 200 characterised by a white 
noise level of A = 0.015 + 0.005/^K^. By comparison, the resid- 
ual foreground contamination obtained in our simulations is as 



low as 4 X lO ^K^ at ^ = 200. 



4.3. Power spectrum estimation errors 

Although not the main focus of effort for the Challenge, each 
group provided their own bandpower estimates of the CMB 



power spectrum, which in many cases showed obvious acoustic 
structure out to the sixth or seventh acoustic peak ai I ~ 2000. 
As an illustration of this result, we show in the upper and mid- 
dle panels of Figure [T] the power spectrum estimates from the 
Commander and SMICA methods respectively. 

To make a quantitative estimate of the accuracy of the power 
spectrum estimates D( of 1(1 + \)Cc, we calculate the quantity 



FoM = 



^CclCt 



(6) 



where ADf is the bias in the PSE compared to the PSE derived 
from the input CMB sky, and where hCcjCt is the expected ac- 
curacy of Planck, obtained from Eq. (|7| below. This figure of 
merit penalises biases in the power spectrum estimates without 
taking into account the error bars claimed by each group. 

In the absence of foregrounds, an approximate lower bound 
on the relative standard deviation in estimating the power spec- 
trum is given by 



i3 , 



where the number A^modes of available modes is 



A^modes = /sky ^ + 1) 



(7) 



(8) 



where /s^y denotes the fraction of sky coverage. The average 
noise power spectrum, N{, is obtained from the noise power 
spectra of the different channels 



V ^ 



V v=l 



2 



nhit(f ) ' 



(9) 



(10) 



where Byi is the beam window function for channel v, and using 
the calculated values of Nyc given in Table [T] This theoretical 
limit Eq. (|7]i is used below to assess the impact of foregrounds 
on power spectrum estimation, taking the 70 to 217GHz chan- 
nels and assuming the noise levels from Table[T]together with an 

/sky = 0.8. 

Ideally, the figure of merit given by Eq. ^ should be much 
less than one in the cosmic-variance limited regime (i.e. for 
€ < 500 according to Figure|6]l. Significant deviations from zero 
at low £ and over + 1 at high £ are indications of significant de- 
partures from optimality. We display the FoM Eq. (|6]l, calculated 
for the PSE of each method in the range 2 < ( < 1000, in the 
lower panel of Figure |7] 

Focusing first on the range ^ < 20 we can discern the best 
performance from Commander, which models the spatial varia- 
tion of the foreground spectral indices, thus improving the sub- 
traction of foregrounds on large scales. In the range 20 < I < 
500, SEVEM, specifically designed for an estimate of the CMB 
power spectrum, performs best among the methods tested on this 
challenge. Beyond £ - 500 we see the best performance from 
SEVEM and SMICA. 

At best, it seems that obtaining PSE with FoM < 1 is achiev- 
able for the multipole range 2 < £ < 1000. Overall though, 
complete convergence between the results from different meth- 
ods was not yet achieved. 
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Fig. 7. (Upper) Power spectrum estimates (PSE) using 
Commander on large angular scales. The diamonds show the Cc 
of the input CMB realisation. (Middle) PSE of the recovered 
CMB map using the SMICA method. (Lower) PSE compared 
with the estimates derived from the input CMB Cc, and with 
the expected Planck sensitivity, assuming /sky = 0.8. Beyond 
{ = 500 biases in the PSE set in in some of the methods. 



4.4. Discussion 

In closing this section on the results for the CMB component, we 
make some general observations, and attempt to explain some 
of the differences in these results, as shown in Figure |5] and 



with reference to the characteristics of the analyses as detailed 
in Table |2] 

The most significant foreground contamination, where it ex- 
ists, is associated with regions where free-free emission is most 
intense, such as the Gum nebula and Orion A and B, and this is 
most easily visible in the residual maps of FastICA and WIFIT 
(which also suffers from some dust contamination). Possibly this 
can be explained by the more limited frequency range of data ex- 
ploited in these two analyses. 

The WMAP data were used in the separation only by 
Commander and SMICA. In the Commander case, the inclu- 
sion of additional low frequency channels (in particular the 23 
GHz band) helps to recover the low frequency foregrounds. For 
SMICA, the use of the WMAP channels was not really manda- 
tory for extracting the CMB, rather they have been used with 
the objective of developing a pipeline which uses all observa- 
tions available. WMAP maps are expected to be useful for the 
extraction of low frequency galactic components. This specific 
aspect was not investigated further in the present work, as the 
simulations used here (which do not include any anomalous dust 
emission) are not really adequate to address this problem mean- 
ingfully. 



5. Results for other components 

5.1. Point sources 

Additional efforts have been directed towards producing a cata- 
logue of point sources, a catalogue of SZ clusters and maps of 
the thermal SZ effect and Galactic components. 

The detection of point sources is both an objective of Planck 
component separation (for the production of the Planck early re- 
lease compact source catalogue (ERCSC) and of the final point 
source catalogue), and also a necessity for CMB science, to eval- 
uate and subtract the contamination of CMB ma ps and power 
spect r a by this population of astro physical objects jWright et al.l 
120081: iGonzalez-Nuevo et alj|2008.) . 

The matched filter and the Mexican hat wavelet 2 have ad- 
vantages and drawbacks. In principle, the matched filter is the 
optimal linear filter However, it often suffers from inaccurate es- 
timation of the required correlation matrix of the contaminants, 
and from the difficulty to adapt the filter to the local contamina- 
tion conditions. On the data from the present challenge, this re- 
sulted in excessive contamination of the point source catalogue 
by small scale Galactic emission, mainly dust at high frequen- 
cies. 

Table[3]summarises the PS detection achieved by these meth- 
ods. We found that the Mexican hat wavelet 2 and the matched 
filter performed similarly in most of the frequency channels, and 
complement each other in the others. Performance depends on 
the implementation details, and on properties of the other fore- 
grounds. 

It should be noted that, for all channels, the 5(t detection 
limit is somewhat above what would be expected from (unfil- 
tered) noise alone (by a factor 1.33 for the best case, 44 GHz, 
to 4.8 for the worst case, 857 GHz). This is essentially due to 
the impact of other foregrounds and the CMB, as well as con- 
fusion with other sources. In particular, this effect is more ev- 
ident at 545 and 857 GHz, due to high dust contamination but 
also due to the confusion with the highly correlated population 
of SCUBA sources (Granato et al. 2004; Negrello et al. 200|), 
which constitute a contaminant whose impact on point source 
detection was until now somewhat underestimated. 
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Table 3. Results of point source detection on the present data challenge. 



Channel 


30 GHz 


44 GHz 


70 GHz 


100 GHz 


143 GHz 


217 GHz 


353 GHz 


545 GHz 


857 GHz 


MF: flux limit [mjy] (5% cont.) 


420 


430 


360 


220 


130 


100 


190 


890 


2490 


Detections 


655 


591 


623 


1103 


2264 


2597 


1994 


1200 


1132 


MHW2: flux limit [mJy] (5% cont.) 


395 


450 


380 


250 


140 


120 


230 


430 


2160 


Detections 


762 


621 


599 


1065 


2072 


2203 


1650 


1832 


1259 



Table 4. Performance of the SZ cluster detection methods. 

The table gives the absolute number of detection for \b\ > 20°. 



Method 


Detections 


False 


Reliability 


Needlet-ILC + SExt. 


2564 


225 


91% 


Needlet-ILC + MF 


1804 


179 


90% 


MMF Saclay 


1803 


178 


90% 


MMF IFCA 


1535 


144 


91% 



The number of detections for each frequency channel 
in T able |3] has been compared to the predictions made by 
iLopez-Caniego et al.l (12006 ). properly rescaled for our sky cov- 
erage. In general, there is a good agreement between the pre- 
dictions and the results of this exercise, except for the 857 GHz 
channel, where the number of detections is roughly half the pre- 
dicted one. Again, the difference may be due to the confusion of 
correlated infrared sour ces, that are now present in the PSM but 
were not considered bv lLopez-Caniego et alj (l2006h . 

5.2. SZ effect 

The recovery of an SZ map from the challenge data is illustrated 
in Figure [8] The recovered full sky SZ is obtained by Wiener- 
filtering in harmonic space the needlet-ILC map of the SZ y pa- 
rameter. Wiener filtering enhances the visibility of SZ clusters. 
We clearly identify by eye the brightest clusters in the map. 

One of the main results of this study is the recovery of around 
2300 clusters. This is significantly lower than the performance 
one could expect if the main limitation was the nominal Planck 
noise, and if most detectable clusters were unresolved. Many 
of the recovered clusters are in fact resolved, and thus emit on 
scales where the contamination from CMB is not negligible. 
Small scale Galactic emission and the background of extragalac- 
tic sources, now included in the simulations, further complicate 
the detection. Further study is necessary to find the exact origin 
of the lack of performance, and improve the detection methods 
accordingly. 

Actual detection performance, limited to 67% of the sky at 
Galactic latitudes above 20 degrees, is shown in Table |4] The 
ILC + SExtractor method gives the best result. The ILC-hMF ap- 
proach performs as well as the matched multifilter here. The two 
implementations of the MMF perform similarly. The difference 
in the number of detections achieved (about 13.5%), however, 
suggests that implementation details are important for this task. 

Using the detected cluster catalogue obtained with the MMF, 
we have produced a mask of the detected SZ clusters. For each 
of the 1625 clusters we masked a region whose radius is given by 
the corresponding input cluster virial radius (ten times the core 
radius here). 



5.3. Galactic components 

For the Challenge, a number of methods were applied for sep- 
arating out Galactic components. Table |2] lists which Galactic 
components were obtained by the different methods. Five groups 
have attempted to separate a high frequency dust-like compo- 
nent. Four groups have attempted separation of synchrotron and 
free-free at low frequencies. 

We compared the reconstructed component maps with their 
counterpart input maps, both in terms of the absolute residual 
error and in terms of the relative error. Both these measures are 
computed after removing the best-fit monopole and dipole from 
the residual eiTor map (fitted when excluding a region +30° in 
Galactic latitude). We then defined a figure of merit /2o%, which 
corresponds to the fraction of sky where the foreground ampli- 
tude is reconstructed with a relative error of less than 20%. 

The main results can be summarised as follows: The dust 
component was the best reconstructed component with an /2o% - 
0.7 for all methods. The relative error typically becomes largest 
at the higher galactic latitudes where the dust emission is 
faintest. The synchrotron component was reconstructed with an 
/2()% - 0.3-0.5, with Commander achieving the best results at 
3° resolution, but with noticeable errors along the galactic ridge 
where, in our simulations, the synchrotron spectral index flat- 
tens off. Free-free emission is detected and identified in regions 
such as the Gum Nebula, Orion A and B, and the Ophiucus 
complex. However, the reconstruction of the free-free emission 
at low Galactic latitudes needs improving. On the other hand, 
the total Galactic emission (free-free plus synchrotron) at low- 
frequencies is better reconstructed, with /2o% - 0.5-0.8, with 
the best results from Commander. 

In Figure |9] we show for illustration the recovered total 
Galactic emission at 23GHz from Commander, the dust emis- 
sion at 143GHz from FastICA and, for comparison, the recov- 
ered CMB from SMICA on the same patch. 

6. Summary and conclusions 

In this paper we have described a CMB component separation 
Challenge based on a set of realistic simulations of the Planck 
satellite mission. The simulated data were based on a develop- 
ment version of the Planck Sky Model, and included the fore- 
ground emission from a three component Galactic model of free- 
free, synchrotron and dust, as well as radio and infra-red sources, 
the infra-red background, the SZ effect and PLANCK-like inho- 
mogeneous noise. We have cautioned that the simulations, while 
complex, still relied on some simplifying assumptions, as dis- 
cussed in Section [23] Thus, there is no guarantee that the priors 
and data models that yielded the best separation on simulations 
will work equally well on the Planck dataset. While this may 
seem to undercut the main purpose of this paper, we are simply 
acknowledging that we cannot anticipate in full detail what the 
Planck component separation pipeline will look like and how 
effective it will be, based on an analysis of present-day Simula- 
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Fig. 8. Patch of the recovered needlet-ILC SZ map and input SZ map. For easier comparison of the two maps (12.5 deg x 
12.5 deg), the input SZ map has been filtered to the same resolution as the output. 



tions. In spite of this, it is clear that methods performing better 
on the simulated data have in general better chances to work bet- 
ter on the real sky. 

As a combined set of tools, the component separation meth- 
ods developed and tested in this work off'er very different ways 
to address the component separation problem and so comparable 
performance between different tools, when achieved, provides 
confidence in the conclusions of this work against some of the 
simplifications used in the model for simulating the sky emis- 
sion. 

We found that the recovered CMB maps were clean on large 
scales, in the sense that the RMS of the residual contamination 
was much less than the cosmic variance: at best the RMS resid- 
ual of the cleaned CMB maps was of the order of 2fiK across the 
sky on a smoothing scale of 45', with a spectral distribution de- 
scribed by A = 0.015 X ^-"-^K^ and A = 0.02 x r^ '^fiK^ at high 
and low Galactic latitudes respectively. The effectiveness of the 
foreground removal is illustrated by comparing the input fore- 
ground power spectra of Figure [3] with the residuals shown in 
Figure |6] The two panels of the latter figure show that, with few 
exceptions, the methods manage to clean the low Galactic lati- 
tude Zone 2, where the foreground contamination is high, almost 
as well as they do for the high Galactic latitude Zone 1, where 
the CMB dominates at the frequencies near the foreground min- 
imum. The amplitude of the power spectrum of residuals is, on 
the largest scales, four orders of magnitude lower than that of the 
input Galaxy power spectrum at the foreground minimum. This 
means that the CMB map could be recovered, at least by some 
methods, over the whole sky except for a sky cut at the 5 percent 
level (see Fig.|4]l. The CMB power spectrum was accurately re- 
covered up to the sixth peak. 

As detailed in Table |2] the outputs of the methods were 
diverse. While all have produced a CMB map, only a sub- 



set of them were used to obtain maps of individual dif- 
fuse Galactic emissions. Five (Commander, CCA, FastlCA, 
FastMEM, SMICA) reconstructed thermal dust emission maps 
at high frequencies, and another five (Commander, CCA, 
FastMEM, GMCA, SMICA) yielded a map of the low-frequency 
Galactic emissions (synchrotron and free-free). 

It is not surprising that the dust component was more eas- 
ily reconstructed because it is mapped over a larger frequency 
range, and benefits from observations at high frequencies where 
it dominates over all other emissions, except the IR background 
at high Galactic latitudes. Moreover at high frequencies the noise 
level is lower and the angular resolution is better Low frequency 
Galactic foregrounds suffer from more confusion, with a mixture 
of several components observed in only few channels, at lower 
resolution. 

The relative errors of the reconstructed foreground maps 
are larger at high Galactic latitudes where the foregrounds are 
fainter We have defined a figure of merit /2o%, which corre- 
sponds to the fraction of sky where the amplitude of each galac- 
tic component has been reconstructed with a relative error of 
less than 20%. For most methods, /2o% - 70% was achieved 
for the dust component, while /2o% - 50% was achieved for 
the radio emission, increasing to 80% if component separation 
is performed at a relatively low resolution of 3°. Clearly, there is 
ample room and need for improvement in this area. 

The flux limits for extragalactic point source detection are 
minimum at 143 and 217 GHz, where they reach ^ 100 mJy. 
About 1000 radio sources and about 2600 far-IR sources are de- 
tected over about 67% of the sky (\b\ > 20°). Over the same 
region of the sky, the best methods recover about 2300 clusters. 

In closing, we list areas where work is in progress and im- 
provements are expected: The sky model is being upgraded to 
include the anomalous emission component and polarization. 
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Fig. 9. Example input and recovered total galaxy emission at 23GHz, dust at 143GHz and CMB components. 
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We are in the process of integrating point source and SZ ex- 
traction algorithms together with the diffuse component sepa- 
ration algorithms into a single component separation pipeline. 
This is expected, on one side, to decrease the contamination of 
CMB maps on small angular scales, where point and compact 
sources (including SZ effects) dominate and, on the other side, 
to achieve a more efficient point and compact source extraction. 
Assessing this in more detail is part of WG2 work plans for the 
near future. The SEVEM foreground cleaning step now oper- 
ates in wavelet space which allows for improved, scale-by-scale 
removal of foregrounds. In addition the recovery of the power 
spectrum estimates and error bars at the highest multipoles has 
been improved by reducing the cross-correlation between modes 
through the use of an apodised mask. For Commander, work is 
currently ongoing to extend the foreground sampler to multi- 
resolution experiments. CCA is being upgraded to fully exploit 
the estimated spatially-varying spectral indices in the source re- 
construction step; SMICA is being improved to model unre- 
solved point source power. The FastICA algorithm is being im- 
proved to handle data with a wider frequency coverage. The 
GMCA framework is being extended to perform a joint sepa- 
ration and deconvolution of the components. 
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Appendix A: Description of methods 

A.1. Commander 

'Commander' is an implementation of the CMB and foreground s 
Gibbs sampler most recently described bv lEriksen et al.l (l2008l) : 
This algorithm maps out the joint CMB-foreground probability 
distribution, or 'posterior distribution', by sampling. The target 
posterior distribution may be written in terms of the likelihood 
and prior using Bayes' theorem, 

Pr(s, Q, 0fg|rf) - J:(d\s, %) Pr(*|Cf) Pr(Q) Pr(0fg). (A. 1) 

Here 6{g is the collection of all parameters required to describe 
the non-cosmological foregrounds. Since the noise is assumed to 
be Gaussian, the likelihood is simply given by the;^'^. 

In the current analysis, the foregrounds are modelled by a 
sum of synchrotron, free-free and thermal dust emission, and 
free monopole and dipoles at each frequency band. The thermal 
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dust component is approximated by a single-component modi- 
fied blackbody with a fixed dust temperature - 2 IK. Thus, 
the total foreground model reads 

sfi{v,p) = A,(/7)g(y)J-j +Af{p)g{v)\-^ 

3 

+mO + ^mt(n(p)-ei), (A.2) 

/=i 

where g{v) is the conversion factor between antenna and thermo- 
dynamic temperatures, and n is the unit vector of pixel p. The 
free parameters are thus the foreground amplitudes. As, Af and 
A^i, and spectral indices, /3s and fi^, for each pixel, and the over- 
all monopole, m^, and dipole amplitudes, mj,, for each band. For 
priors, we adopt the product of the Jeffreys' ignorance prior and 
an informative Gaussian prior (J3s = -3 + 0.3 for synchrotron 
and Pd = 1 .5 + 0.3 for dust) for the spectral indices, while no 
constraints are imposed on the amplitudes. 

Using the Gibbs (conditional) sampling technique, a set of 
samples drawn from the posterior distribution. From these sam- 
ples the marginal posterior mean and RMS component maps are 
derived, as well as the marginal CMB power spectrum posterior 
distribution. 

The code assumes identical beams at all frequencies, and 
it is therefore necessary to smooth the data to a common res- 
olution, limiting the analysis to large angular scales. For this 
particular data set, we have chosen a common resolution of 3° 
FWHM, with 54' pixels (Healpix A^side=64) and wit h^^ax = 150 
For m ore details on the degradation process, see Eriksen et al.l 
(|2008|) . At this resolution, the CPU time for producing one sam- 
ple is around one wall-clock minute. A total of 5400 samples 
were produced over four independent Markov chains, of which 
the first 2400 were rejected due to burn-in. Twelve frequency 
bands (covering frequencies between 23 and 353 GHz) were in- 
cluded, for a total cost of around 1000 CPU hours. 

The main advantage of this approach is simply that it pro- 
vides us with the exact joint CMB and foreground posterior dis- 
tribution for very general foreground models. From this joint 
posterior distribution, it is trivial to obtain the exact marginal 
CMB power spectrum and sky signal posterior distributions. 
Second, since any parametric foreground model may be included 
in the analysis, the method is very general and flexible. It also 
provides maps of the posterior means for individual components, 
and is therefore a true component separation method, and not 
only a foreground removal tool. 

Currently, the main disadvantage of the approach is the as- 
sumption of identical beam profiles at each frequency. This 
strictly limits the analysis to the lowest resolution of a particular 
data set. However, this is a limitation of the current implementa- 
tion, and not of the method as such. 

A.2. Correlated Component Analysis (CCA) 

CCA (iBedini et al.1 120051) is a semi-blind approach that relies 
on the second-order statistics of the data to estimate the mix- 
ing matrix on sub-patches of the sky. CCA assumes the data 
model given by Eq. (HJi, and makes no assumptions about the 
independence or lack of correlations between pairs of radiation 
sources. The method exploits the spatial structure of the indi- 
vidual source maps and adopts commonly accepted models for 



source frequency scahngs in order to reduce the number of free 
parameters to be estimated. 

The spatial structures of the maps are accounted for through 
the covariance matrices at different shifts. From the data model 
adopted, the data covariance matrices at shifts (t, i//) are given by 

Cd(T,(A) = ([d(0,0)-A(d][d(0 + T,0 + iA)-/"dr) 

= ACs(t, iA)A' + C„(t, (A) . (A.3) 

where is the mean data vector, and (6, (p) is the generic pixel 
index pair. The matrices Cd(T, ij/) can be estimated from the data, 
and the noise covariance matrices Cn(T, ij/) are derived from the 
map-making noise estimations. From Eq. ( |A.3t , we can estimate 
the mixing matrix and free parameters of the source covariance 
matrices by matching the known quantities to the unknowns, that 
is by minimizing the following function for A and C^ij, t//) 

J] ||ACs(T, ^)A' - [Cd(r, ^) - C„(T, ^)]||, (A.4) 

where the Frobenius norm is used and the summation is taken 
over the set of shift pairs (t, i//) for which data covariances are 
non-zero. Given an estimate of Cs and C„, Eq. (HI can be in- 
verted and component maps obtained via the standard inversion 
techniques of Wiener filtering or generalised least square inver- 
sion. For the Challenge, harmonic space Wiener filtering was 
applied, using a mixing matrix obtained by averaging the mix- 
ing matrice s of different patches. More details on the method can 
be found in lBonaldi et al] (I2006ll200l . 

CCA can treat the variability of the spectral properties of 
each component with the direction of observation by working 
on sufficiently small sky patches, which must however be large 
enough to have sufficient constraining power; typically the num- 
ber of pixels per patch must be around 10^. To obtain a con- 
tinuous distribution of the free parameters of the mixing ma- 
trix, CCA is applied to a large number of partially overlapping 
patches. 

A drawback of the present version of CCA is common to 
many pixel-domain approaches to separation: the data must be 
smoothed to a c ommon r esolution. A Fourier-domain implemen- 
tation of CCA jBedini & Salerno 2007) would be able to cope 
with this problem. Alternatively for the pixel-domain version, 
the mixing matrix could be estimated from the smoothed maps 
and then used to separate the sources using the full resolution 
data. 

A.3. Generalised morphological component analysis 
(GMCA) 

GMCA tBobin et a D|2007^ is a blind source separation method 
devised for separating sources from instantaneous linear mix- 
tures using the model given by Eq. (|4]l. The components s are 
assumed to be sparsely represented (i.e. have a few significant 
samples in a specific basis) in a so-called sparse representation 
(typically wavelets). Assuming that the components have a 
sparse representation in the wavelet domain is equivalent to as- 
suming that most components have a certain spatial regularity. 
These components and their spectral signatures are then recov- 
ered by minimizing the number of significant coefficients in O : 

min(a.s)/l||sO^|| + ^lld - as||^ (A.5) 

In lBobin et al.l (|2007|) . it was shown that sparsity enhances the 
diversity between the components thus improving the separation 
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quality. The spectral signatures of CMB and SZ are assumed to 
be known. The spectral signature of the free-free component is 
approximately known up to a multiplicative constant (power law 
with fixed spectral index). The synchrotron component is esti- 
mated via a separable linear model : dsync - a.nnc'S.svnr where 
dsync is parameterised by a spectral index /5sync- This spectral in- 
dex is estimated by solving the following problem: 



mm rs- 



c(/?)'SHaslamll 



(A.6) 



where Tsync is the residual obtained by extracting the contribution 
of all the components from the data d except synchrotron. ^Hasiam 
is the Haslam synchrotron map; asy„cifi) is the spectral signature 
of synchrotron emission (power law). More precisely, /? is esti- 
mated such that the Haslam multiplied by asyndP) matches the 
residual term r^ync- 

A Wiener filter is applied to provide the denoised CMB es- 
timate. The main advantage of GMCA is its ability to blindly 
extract strong galactic emission. Indeed, most galactic emission 
is well represented in a wavelet basis. The main disadvantage 
is that it relies on the way the deconvolution of the data is per- 
formed: an effective beam is used to account for the convolution. 



A.4. Independent component analysis (FastICA) 

Independent Component Analysis is an approach to component 
separation, looking for the components which max imise some 
measure of the statistical independence ( Hvvarinenii 1999) . The 
FastICA algorithm presented here exploits the fact that non- 
Gaussianity is usually a convenient and robust measure of the 
statistical independence and therefore it searches for linear com- 
binations y of the input multi-frequency data, which maximise 
some measure of the non-Gaussianity. In the specific implemen- 
tation of the idea, employed here, the non-Gaussianity is quanti- 
fied by the neg-entropy. Denoting by H{y) - - j p(y) log p(y)dy 
the entropy associated with the distribution p, we define the neg- 
entropy as. 



neg-entropy(y) = H(yc) - H(y) , 



(A.7) 



where yc is a Gaussian variable with the same covariance ma- 
trix as y. The search for the maxima of the neg-entropy is usu- 
ally aided by enhancing the role of the higher order moments 
of y, which is achieved by means of a non-linear mapping. In 
the present implementation, the FastICA finds the extrema of 
the neg-entropy approximation given by |£'[^(y)] - E[g(yQ)]\^, 
where E means the average over the pixels, and g represents 
the non-linear mapping of the data, which may be a power law 
in the simplest case. The algorithm is straightforwardly imple- 
mented in real space, and requires the same angular resolution 
for all channels. Note that for an experiment like Planck where 
the resolution varies with frequency, this requires smoothing the 
input data to the lowest resolution before processing. The use 
of an efficient minimization procedure, with a required number 
of floating point operations scaling linearly with the size of the 
data set, makes the computational requirements essentially dom- 
inated by memory needed to be allocated to quickly access the 
multi-frequency data. 

The algorithm has been tested so far as a CMB cleaning pro- 
cedure, because the hypothesis of statistical independence is ex- 
pected to be verified at least between CMB and diffuse fore- 
grounds. It produced results on real (BEAST, COBE, WMAP) 
and simulated total intensity data, as well as on polarization sim- 
ulations, on patches as well as all sky (see iMaino et al.l (|2007|) 
and references therein). The performance is made possible by 



two contingencies, i.e. the validity of the assumption of statis- 
tical independence for CMB and foregrounds, as well as the 
high resolution of the present CMB observations, which pro- 
vides enough of statistical realizations (pixels) for the method 
to decompose the data into the independent components. 

A.5. Harmonic-space maximum entropy method (FastMEM) 

The Maximum Entropy Method (MEM) can be used to sep- 
arate the CMB signal from astrophysical foregrounds includ- 
ing Galactic synchrotron, dust and free-free emission as well as 
SZ effects. The particular implementation of MEM used here 
works in the spherical harmonic domain. The separation is per- 
formed mode-by-mode allowing a huge optimisation problem to 
be split into a number of smaller problems. The solution can 
thus be obtained more rapidly, giving this impl ementation its 
name: FastMEM. This approach is described by iHobson et al.l 
( 19981 1999) fo r Fourier mo des on flat patches of the sky and by 
IStolvarov et akl (l2002ll2005l) for the full-sky case. 

If we have a model (or hypothesis) H in which the measured 
data d is a function of an underlying signal s, then Bayes' theo- 
rem tells us that the posterior probability PT{s\d, H) is the prod- 
uct of the likelihood Pr(d|s, H) and the prior probability Pr(s, H), 
divided by the evidence Pr(d, H), 



Pr(s|d,//) = 



Pr(d|s,//)Pr(s|//) 
Pr(d|//) ■ 



(A.8) 



The objective here is to maximise the posterior probability of 
the signal given the data. Since the evidence in Bayes' theorem 
is merely a normalisation constant we maximise the product of 
the likelihood and the prior 



Pr(s|d, H) oc Pr(d|s, H) Pr(s, H). 



(A.9) 



We assume that the instrumental noise in each frequency channel 
is Gaussian -distributed, so that the log-likelihood has a form of a 
misfit statistic. We make the assumption that the noise is un- 
correlated between spherical harmonic modes. We also assume 
that the beams are azimuthally symmetric, so that they are fully 
described by the beam transfer function B( in harmonic space. 
For mode ((, m), the log-likelihood is 



(A. 10) 



where A is the fixed frequency conversion matrix which de- 
scribes how the components are mixed to form the data, and A^^^' 
is the inverse noise covariance matrix for this mode. If the instru- 
mental noise is uncorrelated between channels, then this matrix 
is diagonal. However, unresolved point sources can be modelled 
as a correlated noise component. 

The prior can be Gaussian, and in this case we recover the 
Wiener filter with the well-known analytical solution for the sig- 
nal s. However, the astrophysical components have strongly non- 
Gaussian distribution, especially in the Galactic plane. Therefore 
Hobson et al. ( 1998) suggested that an entropic prior be used in- 
stead. In this case, maximising the posterior probability is equiv- 
alent to the minimising the following functional for each spheri- 
cal harmonic mode 



<l>MEM(S^m) (Sfm) - aS (SCm) 



(A.ll) 



where S (s) is the entropic term, and a is the regularisation pa- 
rameter. The minimisation can be done numerically using one of 
a number of algorithms (Press et al. 1992). 

FastMEM is a non-blind method, so the spectral behaviour 
of the components must be known in advance. Since A is fixed. 
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the spectral properties of the components must be the same ev- 
erywhere on the sky. However, small variations in the spectral 
properties, for example, dust temperature, synchrotron spectral 
index or SZ cluster electron temperature, can be accounted for 
by introducing additional components. These additional compo- 
nents correspond to terms in the Taylor expansion of the fre- 
quency spectrum with respect to the relevant parameter. 

The initial priors on the components are quite flexible and 
they can be updated by iterating the component separation, es- 
pecially if the signal-to-noise is high enough. 

It is not necessary for all of the input maps to be at the same 
resolution since FastMEM solves for the most probable solu- 
tion for unsmoothed signal, deconvolving and denoising maps 
simultaneously. It is flexible enough to include any datasets with 
known window function and noise properties. A mask can eas- 
ily be applied to the input data (the same mask for all frequency 
channels) and this does not cause problems with the separation. 

Since FastMEM uses priors on the signals, the solution for 
the signals is biased. This is especially evident if the signal-to- 
noise ratio is low. It is possible to de-bias the power spectrum 
statistically, knowing the priors and the FastMEM separation er- 
rors per mode. However, one can not de-bias the recovered maps 
since the errors are quadratic and de-biasing will introduce phase 
errors in the harmonics. 

No information about the input components was used in the 
separation, and the prior power spectra were based solely on 
the physical properties of the components and templates avail- 
able in the literature. The prior on the CMB component was 
set using the best-fit theoretical spectrum, instead of a WMAP- 
constrained realisation. This has a significant effect at low mul- 
tipoles. 

A.6. Spectral estimation via expectation-maximization 
(SEVEM) 

SEVEM jMartmez-Gonzalez et al.l l2003l) tries to recover only 
the CMB signal, treating the rest of the emissions as a gener- 
alised noise. As a first step, the cosmological frequency maps, 
100, 143 and 217 GHz, are foreground cleaned using an inter- 
nal template fitting technique. Four templates are obtained from 
the difference of two consecutive frequency channels, which are 
smoothed down to the same resolution if necessary,to avoid the 
presence of CMB signal in the templates. In particular, we con- 
struct maps of (30-44), (44-70), (545-353) and (857-545) differ- 
ences. The central frequency channels are then cleaned by sub- 
tracting a linear combination of these templates. The coefficients 
of this combination are obtained minimising the variance of the 
final clean map outside the considered mask. The second step 
consists of estimating the power spectrum of the CMB from the 
three cleaned maps using the method ( based on the Expectation- 
Maxim ization algorithm) described in iMartmez-Gonzalez et al.l 
(l2003h . which has been adapted to deal with spherical data. 
Using simulations of CMB plus noise, processed in the same 
way as the Challenge data, we obtain the bias and statistical er- 
ror of the estimated power spectrum and construct an unbiased 
version of the Cf's of the CMB. This unbiased power spectrum 
is used to recover the CMB map from the three clean channels 
through Wiener filter in harmonic space. Finally, we estimate the 
noise per pixel of the reconstructed map using CMB plus noise 
simulations. 

One of the advantages of SEVEM is that it does not need 
any external data set or need to make any assumptions about the 
frequency dependence or the power spectra of the foregrounds, 
other than the fact that they are the dominant contribution at the 



lowest and highest frequency channels. This makes the method 
very robust and, therefore, it is expected to perform well for real 
Planck data. Moreover, SEVEM provides a good recovery of the 
power spectrum up to relatively high { and a small error in the 
CMB map reconstruction. In addition the method is very fast, 
which allows one to characterise the eiTors of the CMB power 
spectrum and map using simulations. The cleaning of the data 
takes around 20 minutes, while the estimation of the power spec- 
trum and map requires around 15 and 30 minutes respectively. 
In fact, the whole process described, including producing sim- 
ulations to estimate the bias and errors, takes around 30 hours 
on one single CPU. Regarding weak points, the method recon- 
structs only the CMB and does not try to recover any other com- 
ponent of the microwave sky although it could be generalised 
to reconstruct simultaneously the both the CMB and the thermal 
SZ effect. Also, the reconstructed CMB map is not full-sky, since 
the method does not aim to remove the strong contamination at 
the centre of the Galactic plane or at the point source positions. 
In any case, the masked region excluded for the analysis is rela- 
tively small. 

A. 7. Spectrai matcliing independent component analysis 
(SMICA) 

The principle of SMICA can be summarised in three steps: 1) 
Compute spectral statistics. 2) Fit a component-based model to 
them. 3) Use the result to implement a Wiener filter in harmonic 
space. More specifically, an idealised operation goes as follows. 
Denote d(^) the column vector whose i-th entry contains the ob- 
servation in direction ^ for the i-th channel and denote d(„, the 
vector of same size (the number of frequency channels) in har- 
monic space. This is modelled as the superposition of C com- 
ponents d(m - Tic=i *^fm- ^^^P 1)' compute spectral ma- 
trices Cf = 2^ Yjm df„,df„,- In Step 2) we model the ensemble- 
averaged spectral matrix Ce = (Ce) as the superposition of C un- 
coiTelated components: C[ = T,c=i ^'^^ s^ch component, 

we postulate a parametric model, that is, we let the matrix set 
{C^}^°!g be a function of a parameter vector ff^. This parameteri- 
zation embodies our prior knowledge about a given component. 
For instance, for the CMB component, we take [C™''],j = eiejcc 
where e, is the known CMB emmission coefficient for channel 
/ and cc is the unknown angular power spectrum at frequency {. 
The parameter vector for CMB would then be (f"* - {cef^'^. 
All unknown parameters for all components are then estimated 
by fitting the model to the spectral statistics, i.e. by solving 

mme^_gcZto(^£ + D K[ Q | ZL^C^iff-l ]where K[Ci\C2] 
is a measure of mismatch between two covariance matrices Ci 
and C2. The resulting values 6^, . . .,6'" provide estimates Cj-iff^) 
of C^. The Wiener filter estimate of d^^^ can be expressed as 

^f/ii ~ C^C^'df,,,. In practice, we use the fitted spectral matrices 
estimated at the previous step: component c is estimated as 

d?,„ = C?(0'')Cf(0)-'df„, (A. 12) 

and the maps of each component in each channel are finally com- 
puted by inverse spherical harmonic transforms. 

For processing the cuiTent data set, we have used a model 
containing four components: the CMB, the SZ component, a 4- 
dimensional Galactic component and a noise component. 

The actual processing includes several modifications with re- 
spect to this outline: a) beam correction applied to each spectral 
matrix Cf, b) spectral binning by which the (beam corrected) 
spectral matrices are averaged over bins of increasing lengths; 
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c) localization implemented via aopdised masks, by which the 
SMICA process is conducted independently over two different 
sky zones. 

Strengths: a) No prior information used regarding Galactic 
emission, b) Accurate recovery of the CMB via Wiener filtering, 
c) It is a relatively fast algorithm, d) Built-in goodness of fit. 

Weaknesses: a) The results reported here do not account for 
the contribution of point sources for which a convenient model 
is lacking, b) Localization in two zones is probably too crude, c) 
No separation of Galactic components. 

A.8. Wavelet-based high-resolution Fitting of Internal 
Templates (WI-FIT) 

WI-FIT (iHansen et alJl2006h is based on fitting and subtraction 
of internal templates. Regular (external) template fitting uses ex- 
ternal templates of Galactic components based on observations 
at frequencies different from the ones used to study the CMB. 
These templates are fitted to CMB data, the best fit coefficients 
for each component are found and the templates are subtracted 
from the map using these coefficients in order to obtain a clean 
CMB map. WI-FIT differs from this procedure in two respects: 
(1) It does not rely on external observations of the galaxy but 
forms templates by taking the difference of CMB maps at dif- 
ferent channels. The CMB temperature is equal at different fre- 
quencies whereas the Galactic components are not. For this rea- 
son, the difference maps contain only a sum of Galactic compo- 
nents. A set of templates are constructed from difference maps 
based on different combinations of channels. (2) The fitting of 
the templates are done in wavelet space where the uncertainty 
on the foreground coefficients is much lower than a similar pixel 
based approach (in the pixel based approach, no pixel-pixel cor- 
relations are taken into account since the correlation matrix will 
become to large for PLANCK-like data sets. In the wavelet based 
approach, a large part of these correlations are taken into account 
in scale-scale covariance matrices). 

For calibration purposes, a set of 500 simulated CMB maps 
need to be produced and the full wavelet fitting procedure ap- 
plied to all maps. This is where most CPU time goes. For Planck 
resolution maps, around 1 Gb of memory is necessary to apply 
WI-FIT and a total of around 400 CPU hours are required. 

The strength of WI-FIT is that it relies on very few assump- 
tions about the Galactic components. WI-FIT does however as- 
sume that the spectral indices do not vary strongly from pixel 
to pixel within the frequency range used in the analysis. If this 
assumption is wrong then WI-FIT leaves residuals in the areas 
where there are strongly varying spectral indices. 

Another advantage of WI-FIT is that it is easy to apply and is 
completely linear, i.e. the resulting map is a linear combination 
of frequency channels with well known noise and beam prop- 
erties. This will in general result in increased noise variance in 
the cleaned map. In order to avoid this, we smooth the inter- 
nal templates in order to make the noise at small scales negli- 
gable and at the same time not make significant changes to the 
shape of the diffuse foregrounds. If the diffuse foregrounds turn 
out to be important at small scales I > 300, the smoothing of 
the internal templates will significantly reduce the ability of WI- 
FIT to perform foreground cleaning at these scales. Tests on the 
WMAP data have shown that diffuse foregrounds do not seem to 
play an important role at such small scales. This is valid for the 
frequency range observed by WMAP (i.e. at LFI-frequencies), 
similar tests will need to be made for the Planck HFI data. 

Finally, WI-FIT does not do anything to the point sources, 
which need to be masked. 



